Isolation Scheduler Hosts Sorting | Add secondary sort (No of free slots) and tertiary sort (Hostname)#8709
Isolation Scheduler Hosts Sorting | Add secondary sort (No of free slots) and tertiary sort (Hostname)#8709meetjain74 wants to merge 1 commit into
Conversation
|
Hi @meetjain74 ! What do you think about adding some unit tests for this? Thank you |
| final Map<String, Integer> hostFreeSlotCount = new HashMap<String, Integer>(); | ||
| for (WorkerSlot slot : cluster.getAvailableSlots()) { | ||
| String host = cluster.getHost(slot.getNodeId()); | ||
| if (hostAssignableSlots.containsKey(host)) { |
There was a problem hiding this comment.
can we iterate over hostAssignableSlots in the outer loop? This will reduce the number of iterations, since hostAssignableSlots contains the target hosts
| } | ||
| } | ||
|
|
||
| List<HostAssignableSlots> sortHostAssignSlots = new ArrayList<HostAssignableSlots>(); |
There was a problem hiding this comment.
ran this through an LLM.
Suggested approach to avoid doing Map lookup in every sort iteration
1. Add freeSlots to HostAssignableSlots:
1 class HostAssignableSlots {
2 private final int freeSlots;
3 // ... update constructor and add getter
4 }
2. Calculate once per host, then sort:
1 // Build objects with pre-calculated metadata
2 for (Map.Entry<String, List<WorkerSlot>> entry : hostAssignableSlots.entrySet()) {
3 int free = (int) entry.getValue().stream().filter(cluster::isSlotAvailable).count();
4 sortHostAssignSlots.add(new HostAssignableSlots(entry.getKey(), entry.getValue(), free));
5 }
6
7 // Faster Comparator (Direct field access)
8 Collections.sort(sortHostAssignSlots, (o1, o2) -> {
9 int bySlots = o2.getWorkerSlots().size() - o1.getWorkerSlots().size();
10 if (bySlots != 0) return bySlots;
11
12 int byFree = o2.getFreeSlots() - o1.getFreeSlots();
13 if (byFree != 0) return byFree;
14
15 return o1.getHostName().compareTo(o2.getHostName());
16 });
`
There was a problem hiding this comment.
Will pick and fix this and update the PR
rzo1
left a comment
There was a problem hiding this comment.
Thanks for the PR, and +1 to @reiabreu's two points — they're both on the mark. A few things to add on top.
On the comparator refactor (@reiabreu's line 334 suggestion): it's correct and worth doing — I checked the Cluster internals and getAvailablePorts() = getAssignablePorts() − usedPorts (Cluster.java:411-418), so a host's available slots are always a subset of its assignable slots. That means counting free slots among each host's own assignable list is exactly equivalent to the current "bucket getAvailableSlots() by host" approach — safe to switch, and it lifts the map lookup out of the comparator. One gotcha: the suggested snippet uses cluster::isSlotAvailable, which doesn't exist — the method is cluster.isSlotOccupied(WorkerSlot), so the predicate needs to be slot -> !cluster.isSlotOccupied(slot) (or just keep deriving the count from getAvailableSlots() and pass it into the constructor).
On unit tests (@reiabreu's earlier ask): I'd treat this as the one blocker. There's currently no focused IsolationScheduler test anywhere in the tree, so this selection path is entirely uncovered and the value of the change is unverifiable. A small test pinning both new keys would do it:
- two hosts, equal assignable slots, different free-slot counts → assert the host with more free slots (fewer evictions) is selected first;
- two hosts equal on both → assert the lexicographic hostname tiebreak.
The tertiary hostname sort is a nice fix in its own right — the comparator's input came from HashMap.entrySet(), so equal-ranked hosts were previously ordered non-deterministically.
LGTM on the logic; I'd just want the unit test before merge. 👍
What is the purpose of the change
All things are mentioned in the Issue #8708
How was the change tested
Deployed topologies after the changes and is working as expected.